Abstract: We investigate two source coding problems with
secrecy constraints. In the first problem we consider real-time 
fully secure transmission of a memoryless source. We show that 
although classical variable-rate coding is not an option since the 
lengths of the codewords leak information on the source, the key 
rate can be as low as the average Huffman codeword length of the 
source. In the second problem we consider causal source coding 
with a fidelity criterion and side information at the decoder 
and the eavesdropper. We show that when the eavesdropper has 
degraded side information, it is optimal to first use a causal rate 
distortion code and then encrypt its output with a key. 

I. Introduction 

We consider two source coding scenarios in which an 
encoder, referred to as Alice, transmits outcomes of a memoryless
source to a decoder, referred to as Bob. The communication
between Alice and Bob is intercepted by an eavesdropper, 
referred to as Eve. 

In the first scenario, we consider real-time communication 
between Alice and Bob and require full secrecy, meaning that 
the intercepted transmission does not leak any information 
about the source. In the second scenario, we consider lossy 
causal source coding when both Bob and Eve have access to 
side information (SI). We require that Eve's uncertainty about 
the source given the intercepted signal and SI will be higher 
than a certain threshold. 

Real-time codes are a subclass of causal codes, as defined 
by Neuhoff and Gilbert [1]. In [1], entropy coding is used
on the whole sequence of reproduction symbols, introducing 
arbitrarily long delays. In the real-time case, entropy coding 
has to be instantaneous, symbol-by-symbol (possibly taking 
into account past transmitted symbols). It was shown in [1]
that for a discrete memoryless source (DMS), the optimal 
causal encoder consists of time-sharing between no more than 
two memoryless encoders. Weissman and Merhav [2] extended
[1] by including SI at the decoder, the encoder, or both.

Shannon [3] introduced the information-theoretic notion
of secrecy, where security is measured through the remaining
uncertainty about the message at the eavesdropper. This
information-theoretic approach to secrecy allows one to consider
security issues at the physical layer, and ensures unconditionally
(regardless of the eavesdropper's computing power and
time) secure schemes, since it only relies on the statistical
properties of the system. Wyner introduced the wiretap channel
in [4] and showed that it is possible to send information at a
positive rate with perfect secrecy as long as Eve's channel is a 
degraded version of the channel to Bob. When the channels are 
clean, two approaches can be found in the literature of secure 
communication. The first assumes that both Alice and Bob 
agree on a secret key prior to the transmission of the source 
(through a separate secure channel for example). The second 
approach assumes that Bob and Eve (and possibly Alice) have 
different versions of side information and secrecy is achieved 
through this difference. 

For the case of a shared secret key, Shannon showed that in
order for the transmission of a DMS to be fully secure, the rate
of the key must be at least as large as the entropy of the source.
Yamamoto ([5] and references therein) studied various secure
source coding scenarios that include an extension of Shannon's
result to combine secrecy with rate-distortion theory. In both
[3], [5], when no SI is available, it was shown that separation is
optimal. Namely, using a source code followed by encryption 
with the shared key is optimal. The other approach was 
treated more recently by Prabhakaran and Ramchandran [6],
who considered lossless source coding with SI at both Bob
and Eve when there is no rate constraint between Alice and
Bob. It was shown that the Slepian-Wolf [7] scheme is not
necessarily optimal when the SI structure is not degraded.
Coded SI at Bob and SI at Alice were considered in [8].
These works were extended by Villard and Piantanida [9] to
the case where distortion is allowed and coded SI is available
to Bob. Merhav combined the two approaches with the wiretap
channel [10]. Note that we mentioned only a small sample
of the vast literature on this subject. 

In the works mentioned above, there were no constraints 
on the delay and/or causality of the system. As a result, the 
coding theorems of the above works introduced arbitrarily long
delay and exponential complexity. 

The practical need for fast and efficient encryption algo- 
rithms for military and commercial applications along with 
theoretical advances of the cryptology community, led to the 
development of efficient encryption algorithms and standards 
which rely on relatively short keys. However, the security of
these algorithms depends on computational complexity and the
intractability assumption of some hard problems. To the best
of our knowledge, there was no attempt so far to analyze the 
performance of a real-time or causal secrecy system from an 
information theoretic point of view. 

The extension of Neuhoff and Gilbert's result [1] to the
real-time case is straightforward and is done by replacing 
the block entropy coding by instantaneous Huffman coding. 
The resulting bitstream between the encoder and decoder is 
composed of the Huffman codewords. However, this cannot 
be done when secrecy is involved, even if only lossless 
compression is considered. To see why, consider the case 
where Eve intercepts a Huffman codeword and further assume 
the bits of the codeword are encrypted with a one-time pad. 
While the intercepted bits give no information on the encoded 
symbol (since they are independent of it after the encryption), 
the number of intercepted bits leaks information on the source 
symbol. For example, if the codeword is short, Eve knows that
the encrypted symbol is one with a high probability (remember 
that Eve knows the source statistics). This suggests that in 
order to achieve full security, the lengths of the codewords 
emitted by the encoder should be independent of the source. 
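The leakage through codeword lengths can be made concrete with a small sketch. The following Python fragment is only an illustration under our own assumptions: the four-symbol pmf and the helper names `huffman_lengths` and `posterior_given_length` are hypothetical and do not appear in the paper. It builds binary Huffman codeword lengths and computes Eve's posterior on the source symbol given only the observed codeword length.

```python
import heapq
from collections import defaultdict

def huffman_lengths(pmf):
    """Return {symbol: codeword length} of a binary Huffman code for pmf."""
    # Heap entries: (probability, unique tie-breaker, symbols under this node).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(pmf, 0)
    tie = len(heap)
    while len(heap) > 1:
        p0, _, s0 = heapq.heappop(heap)
        p1, _, s1 = heapq.heappop(heap)
        for s in s0 + s1:  # each symbol under the merged node gains one bit
            lengths[s] += 1
        heapq.heappush(heap, (p0 + p1, tie, s0 + s1))
        tie += 1
    return lengths

def posterior_given_length(pmf):
    """Return {length: {symbol: P(symbol | observed codeword length)}}."""
    lengths = huffman_lengths(pmf)
    groups = defaultdict(dict)
    for s, p in pmf.items():
        groups[lengths[s]][s] = p
    return {l: {s: p / sum(g.values()) for s, p in g.items()}
            for l, g in groups.items()}

# Hypothetical source statistics, assumed known to Eve.
pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
for length, post in sorted(posterior_given_length(pmf).items()):
    print(length, post)
```

For this assumed pmf the codeword lengths are 1, 2, 3, 3, so a one-bit packet identifies the symbol "a" with certainty even though its single bit is perfectly encrypted.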

In the last example, we assumed that Eve is informed on 
how to parse the bitstream into separate codewords. This will 
be the case, for example, when each codeword is transmitted 
as a packet over a network and the packets are intercepted by 
Eve. Even if the bits are meaningless to Eve, she still knows 
the number of bits in each packet. We show in the sequel that,
despite the above example, the key rate can be as low as the
average Huffman codeword length (referred to hereafter as the
Huffman length) of the source. Full secrecy, in this case, will
be achieved by randomization at the encoder, which can be 
removed by Bob. In contrast to the works mentioned above, 
our results here are not asymptotic. 

We also investigate the scenario where Eve doesn't have 
parsing information and cannot parse the bitstream into the 
separate codewords. This will be the case, for example, if Eve
acquires only the whole bitstream, not necessarily in real time,
without the log of the network traffic, or if she acquires
an encrypted file after it was saved to disk. In this case,
when we assume that the length of transmission is infinite, we
show that the best achievable rates of both the key and the
transmission are given by the Huffman length of the source. 
In contrast to the results described in the previous paragraph, 
the results in this scenario are asymptotic in the sense that 
the probability that the system is not secure is zero when 
the transmission length is infinite. Note that the length of the 
transmission was not an issue in the mentioned previous works 
since block coding was used. Therefore, the block length was 
known a-priori to Eve and leaked no information. 

In the following two sections we deal with the real-time and 
causal settings, respectively. Each section begins with a formal
definition of the relevant problem. 

II. Real-Time Full Secrecy 

We begin with notation conventions. Capital letters represent
scalar random variables (RVs), specific realizations of
them are denoted by the corresponding lower case letters,
and their alphabets by calligraphic letters. For $i < j$ ($i$,
$j$ positive integers), $x_i^j$ will denote the vector $(x_i, \ldots, x_j)$,
where for $i = 1$ the subscript will be omitted. For two random
variables $X, Y$, with alphabets $\mathcal{X}, \mathcal{Y}$, respectively, and joint
probability distribution $\{p(x,y)\}$, the average instantaneous
codeword length of $X$ conditioned on $Y = y$ will be given by

$L(X|Y=y) \triangleq \min_{l(\cdot) \in \mathcal{A}_{\mathcal{X}}} \sum_{x \in \mathcal{X}} P(x|y)\, l(x)$, (1)

where $\mathcal{A}_{\mathcal{X}}$ is the set of all possible length functions $l :
\mathcal{X} \to \mathbb{Z}^+$ that satisfy Kraft's inequality for an alphabet of size
$|\mathcal{X}|$. $L(X|Y=y)$ is obtained by designing a Huffman code
for the probability distribution $P(x|y)$. With the same abuse
of notation common for entropy, we let $L(X|Y)$ denote the
expectation of $L(X|Y=y)$ with respect to the randomness
of $Y$. The Huffman length of $X$ is given by $L(X)$.
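As a rough numerical companion to definition (1), the sketch below computes the Huffman length and its conditional version. It relies on the standard fact that the average length of a binary Huffman code equals the sum of the probabilities of the merged (internal) nodes. The function names `huffman_length` and `conditional_huffman_length`, the `joint[y][x]` layout, and the example distribution are our own assumptions for illustration.

```python
import heapq

def huffman_length(probs):
    """Average binary Huffman codeword length L(X) for a list of probabilities.

    Zero-probability symbols are kept, since the length function in (1)
    must cover the whole alphabet.
    """
    heap = list(probs)
    if len(heap) == 1:
        return 1.0  # lengths are positive integers, so one bit is still sent
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # every merge adds one bit to all symbols below it
        heapq.heappush(heap, merged)
    return total

def conditional_huffman_length(joint):
    """L(X|Y) = sum_y P(y) L(X|Y=y), with joint[y][x] = P(X=x, Y=y)."""
    total = 0.0
    for row in joint:
        p_y = sum(row)
        if p_y > 0:
            total += p_y * huffman_length([p / p_y for p in row])
    return total

# Hypothetical joint distribution of a ternary X and a binary Y.
joint = [[0.40, 0.05, 0.05],
         [0.05, 0.40, 0.05]]
marginal_x = [sum(col) for col in zip(*joint)]
print(huffman_length(marginal_x), conditional_huffman_length(joint))
```

In this assumed example the conditional Huffman length is smaller than the unconditional one, in line with the property that conditioning does not increase the Huffman length.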

In this section, the following real-time source coding problem
is considered: Alice wishes to losslessly transmit the
output of a DMS $X$ with probability mass function $P_X(x)$
to Bob. The communication between Alice and Bob is intercepted
by Eve. Alice and Bob operate without delay. When
Alice observes $X_t$ she encodes it by an instantaneous code
and transmits the codeword to Bob through a clean digital
channel. Bob decodes the codeword and reproduces $X_t$. A
communication stage is defined to start when the source emits
$X_t$ and to end when Bob reproduces $X_t$, i.e., Bob cannot use
future transmissions to calculate $X_t$. We will assume that both
Alice and Bob have access to a completely random binary
sequence, $u = (u_1, u_2, \ldots)$, which is independent of the
data and will be referred to as the key. Let $m_1, m_2, \ldots, m_n$,
$m_j \in \mathbb{N}$, be a non-decreasing sequence of positive integers. At
stage $t$, Alice uses $l_{K_t} = m_t - m_{t-1}$ bits that were not used
so far from the key sequence. Let $K_t = (u_{m_{t-1}+1}, \ldots, u_{m_t})$
denote the stage $t$ key. The parsing of the key sequence up
to stage $t$ should be the same at Alice and Bob. This can be
done "on the fly" through the data already known to both Alice
and Bob from the previous stages. We define the key rate to
be $R_K = \limsup_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} E\, l_{K_t}$. We will also assume
that Alice has access, at each stage, to a private source of
randomness $\{V_t\}$, which is i.i.d. and independent of the source
and the key. Neither Bob nor Eve has access to $\{V_t\}$.

Let $\mathcal{Z}$ be the set of all finite length binary strings. Denote
Alice's output at stage $t$ by $Z_t \in \mathcal{Z}$ and let $B_t$ denote the
unparsed sequence, containing $l_{B_t}$ bits, that were transmitted
up to the end of stage $t$. The rate of the encoder is
defined by $R = \limsup_{n \to \infty} \frac{1}{n} E\, l_{B_n}$.

Given the keys up to stage $t$, $K^t$, Bob can parse $B_k$ into
$Z_1, \ldots, Z_t$ for any $k \ge t$. The legitimate decoder is thus a
sequence of functions $X_t = g_t(K^t, Z^t)$.

As discussed in the Introduction, we will treat two security 
models. In the first model we will assume that Eve can detect 
when each stage starts, i.e., she can parse $B_t$ into $Z_1, \ldots, Z_t$.
In the second model, we will assume that Eve intercepts the
whole bitstream $B_n$ (assuming a total of $n$ stages) but has no
information on the actual parsing of $B_n$ into $Z_1, \ldots, Z_n$. These
models are treated in the following two subsections. 

A. Eve Has Parsing Information 

In this subsection we assume that Eve can parse $B_n$
into $Z_1, Z_2, \ldots, Z_n$. In order for the system to be fully
secure, following [3], we will require that for any $k, m, n$,
$P(X^k | Z_m^n) = P(X^k)$, i.e., acquiring any portion of the
transmission leaks no information on the source that was
not known to Eve in advance.

The most general real-time encoder is a sequence of functions
$Z_t = f_t(K^t, V_t, X^t)$. In this paper, we will treat only a
subclass of encoders that satisfy the Markov chain

$X_t \leftrightarrow Z^t \leftrightarrow K^{t-1}$. (2)

Namely, given the past and current encoder outputs, the current
source symbol, $X_t$, does not reduce the uncertainty regarding
the past keys. We claim that this constraint, in the framework
of complete security, is relatively benign and, in fact, any
encoder that calculates a codeword (possibly using the whole
history of the source and keys, i.e., with the most general
encoder structure), say $\tilde{Z}_t$, and then outputs $Z_t = \tilde{Z}_t \oplus K_t$
will satisfy this constraint. Such a structure seems natural for
one-time pad encryption. Another example of encoders that
satisfy such a constraint are encoders with the structure
$Z_t = f_t(K_t, V_t, X_t, Z^{t-1})$ (we omit the proof that this structure
induces the Markov chain due to space limitations). The
main result of this subsection is the following theorem:

Theorem I. There exists a pair of fully secure real-time
encoder and decoder if and only if $R_K \ge L(X)$.

This theorem is in the spirit of the result of [3], where
the entropy is replaced by the Huffman length due to the real-time
constraint. As discussed in the Introduction, variable-rate
coding is not an option when we want the communication to
be fully secure. This means that the encoder should either
output constant length (short) blocks or have the transmission
length independent of the source symbol in some other way.
Clearly, with constant length blocks, the rate of a lossless
encoder cannot be as low as $L(X)$ for all possible memoryless
sources. The rate of the key, however, can be as low as $L(X)$.
In the proof of the direct part of Theorem I we show that a
constant rate encoder with block length corresponding to the
longest Huffman codeword achieves this key rate. The padding
is done by random bits from the encoder's private source
of randomness. Note, however, that if both the key rate and
encoder rate are $\log |\mathcal{X}|$, lossless fully secure communication
is trivially possible. Although Theorem I does not give a lower
bound on the rate of the encoder, the above discussion suggests
that there is a trade-off between the key rate and the possible
encoder rate that will allow secure lossless communication.
Namely, there is a set of optimal rate pairs, $(R, R_K)$, which
are possible. We prove Theorem I in the following subsections.

1) Converse: For every lossless encoder-decoder pair that
satisfies the security constraint and (2), we lower bound the
key rate as follows:

$\sum_{t=1}^{n} E\, l_{K_t} = \sum_{t=1}^{n} L(K_t)$
$\ge \sum_{t=1}^{n} L(K_t | K^{t-1}, Z^t)$ (3)
$= \sum_{t=1}^{n} L(K_t, X_t | K^{t-1}, Z^t)$ (4)
$\ge \sum_{t=1}^{n} L(X_t | K^{t-1}, Z^t)$ (5)
$= \sum_{t=1}^{n} L(X_t | Z^t)$ (6)
$= \sum_{t=1}^{n} L(X_t)$ (7)
$= nL(X)$. (8)

The first equality is true since the key bits are incompressible
and therefore the Huffman length is the same as the number
of key bits. (3) is true since conditioning reduces the Huffman
length (the simple proof of this is omitted). (4) follows since
$X_t$ is a function of $(K^t, Z^t)$ (the decoder's function) and
therefore, given $(K^{t-1}, Z^t)$, the code for $K_t$ also reveals $X_t$.
(5) is true since, with the same conditioning on $(K^{t-1}, Z^t)$,
the instantaneous code of $(K_t, X_t)$ cannot be shorter than
the instantaneous code of $X_t$. (6) is due to (2) and, finally,
(7) is true by the security model. Dividing by $n$ and taking
the limit, we therefore showed that $R_K \ge L(X)$.

2) Direct: We construct an encoder-decoder pair that is
fully secure with $R_K = L(X)$. Let $l_{\max}$ denote the length of the longest
Huffman codeword of $X$. We know that $l_{\max} \le |\mathcal{X}| - 1$. The
encoder output will always be $l_{\max}$ bits long and will be built
from two fields. The first field will be the Huffman codeword
for the observed source symbol $X_t$. Denote its length by $l(X_t)$.
This codeword is then XORed with $l(X_t)$ key bits. The second
field will be composed of $l_{\max} - l(X_t)$ random bits (taken from
the private source of randomness) that will pad the encrypted
Huffman codeword to be of length $l_{\max}$. Regardless of the
specific source output, Eve sees constant length codewords
composed of random uniform bits. Therefore no information
about the source is leaked by the encoder outputs. When Bob
receives such a block, he starts XORing it with key bits until
he detects a valid Huffman codeword. The rest of the bits are
ignored. Obviously, the key rate which is needed is $L(X)$.
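The construction of the direct part can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the helper names (`huffman_code`, `random_bits`, `encode`, `decode`) are ours, the source alphabet is assumed to have at least two symbols, Python's `secrets` module stands in for both the shared key and Alice's private randomness, and the key is modeled as a list of bits consumed through two synchronized iterators.

```python
import heapq
import secrets

def huffman_code(pmf):
    """Build a binary Huffman code {symbol: bitstring} for pmf {symbol: prob}."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, tie, merged))
        tie += 1
    return heap[0][2]

def random_bits(n):
    """n uniform bits (stand-in for the key or the private randomness)."""
    return [secrets.randbits(1) for _ in range(n)]

def encode(symbol, code, key_bits, l_max):
    """Alice: one-time-pad the Huffman codeword, then pad to l_max bits."""
    word = code[symbol]
    encrypted = [int(b) ^ next(key_bits) for b in word]  # uses l(X_t) key bits
    padding = random_bits(l_max - len(word))             # private randomness
    return encrypted + padding

def decode(block, code, key_bits):
    """Bob: decrypt bit by bit until a valid Huffman codeword appears."""
    inverse = {w: s for s, w in code.items()}
    prefix = ""
    for bit in block:
        prefix += str(bit ^ next(key_bits))
        if prefix in inverse:  # prefix-free code: first match is the codeword
            return inverse[prefix]
    raise ValueError("no valid codeword in block")

# Hypothetical source and message.
pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(pmf)
l_max = max(len(w) for w in code.values())
key = random_bits(1000)
alice_key, bob_key = iter(key), iter(key)
message = list("abacabad")
blocks = [encode(s, code, alice_key, l_max) for s in message]
print([decode(b, code, bob_key) for b in blocks] == message)
```

Because the code is prefix-free, Bob stops after exactly $l(X_t)$ key bits, so both parties consume the key at the same pace while every transmitted block has the same length $l_{\max}$.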

B. Eve Has No Parsing Information 

In this subsection, we relax our security assumptions and 
assume that Eve observes the whole transmission from Alice 
to Bob, but has no information on how to parse the bitstream
$B_n$ into $Z_1, \ldots, Z_n$. Although it is not customary to limit
the eavesdropper in any way in information-theoretic security, 
this limitation has a practical motivation, as discussed in the 
Introduction. 

We will require that the following holds for every $t$ and
every $x \in \mathcal{X}$:

$P(X_t = x | B_n) \to P_X(X_t = x)$ a.s. as $n \to \infty$. (9)

This means that when the bitstream is long enough, the 
eavesdropper does not learn from it anything about the source 
symbols. Note that the encoder from Section II-A2 trivially
satisfies this constraint since it was a constant block length
encoder and the bits within the block were encrypted by
a one-time pad. We will see that with the relaxed secrecy
requirement we can reduce the rate of the encoder to be the
same as the rate of the key. In this section we deal with
encoders that satisfy $X_t \leftrightarrow B_n \leftrightarrow K^{t-1}$. The discussion that
followed the constraint (2) is valid here as well. We have the
following theorem: 

Theorem II. There exists a lossless encoder-decoder pair
that satisfies the secrecy constraint (9) if and only if $R \ge
L(X)$, $R_K \ge L(X)$.

The fact that $R \ge L(X)$ is trivial since we deal with a real-time
lossless encoder. However, unlike the case of Theorem
I, here it can be achieved along with $R_K \ge L(X)$. The
proof of the bound on $R_K$ follows the proof of the previous
section up to (6) by replacing $Z^t$ by $B_n$. We have: $R_K \ge
\frac{1}{n} \sum_{t=1}^{n} L(X_t|B_n)$. Now, since $P(X_t|B_n) \to P(X_t)$ a.s.,
we have that $L(X_t|B_n) \to L(X_t)$ a.s. The direct part
of the proof is achieved by separation. We first encode $X_t$
using a Huffman code and then XOR the resulting bits with
a one-time pad. Therefore, both the encoder and key rates
of this scheme are equal to $L(X)$. We need to show that
(9) holds. We outline the idea here. The bits of $B_n$ are
independent of $X_t$ since we encrypted them with a one-time
pad. Let $l_{B_n}$ represent the number of bits in $B_n$. Since $B_n$ is
encrypted we have $X_t \leftrightarrow l_{B_n} \leftrightarrow B_n$. Therefore, we have that
$P(X_t|B_n) = P(X_t|l_{B_n}, B_n) = P(X_t|l_{B_n})$. From the law of
large numbers, $l_{B_n}/n \to L(X)$ a.s. But if $l_{B_n} = nL(X)$ then
$l_{B_n}$ leaks no information about $X_t$ (since this $nL(X)$ is known
a priori to Eve). The full proof resembles the martingale proof
of the strong law of large numbers and can be found in [11].
Discussion: Unlike Theorem I, Theorem II addresses the rate
of the encoder as well as the rate of the key. The result here
is asymptotic since only when the bitstream is long enough do we
have the independence of $X_t$ from $B_n$. It can be shown that
the probability that $B_n$ reveals information on $X_t$ vanishes
exponentially fast with $n$. Note that if instead of defining the
security constraint as in (9), we had required that for
every $n, t$, $P(X_t|B_n) = P(X_t)$, then a counterpart of Theorem
I would hold here. However, the encoder would, as in the proof
of the direct part of Theorem I, work at a constant rate.
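The asymptotic nature of this result can be illustrated numerically. In the separation scheme the only thing the bitstream can reveal about $X_1$ is the total length $l_{B_n}$, so the sketch below computes the mutual information $I(X_1; l_{B_n})$ exactly by convolving codeword-length distributions. This is only an illustration: the function names `convolve` and `length_leakage`, the example pmf, and its Huffman lengths are our own assumptions, and mutual information is used here as a convenient summary of the leakage, not as the criterion (9) itself.

```python
from math import log2

def convolve(a, b):
    """Distribution of the sum of two independent nonnegative integer RVs."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def length_leakage(pmf, lengths, n):
    """I(X_1; l_{B_n}) in bits for n i.i.d. symbols with given codeword lengths."""
    l_max = max(lengths.values())
    one = [0.0] * (l_max + 1)  # distribution of a single codeword length
    for s, p in pmf.items():
        one[lengths[s]] += p
    rest = [1.0]               # total length of the other n - 1 codewords
    for _ in range(n - 1):
        rest = convolve(rest, one)
    total = convolve(rest, one)  # distribution of l_{B_n}
    info = 0.0
    for s, p in pmf.items():
        for r, pr in enumerate(rest):
            if p > 0 and pr > 0:
                # P(l_{B_n} = r + l(s) | X_1 = s) = pr
                info += p * pr * log2(pr / total[r + lengths[s]])
    return info

# Hypothetical source with its Huffman codeword lengths.
pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
lengths = {"a": 1, "b": 2, "c": 3, "d": 3}
for n in (1, 5, 50):
    print(n, length_leakage(pmf, lengths, n))
```

For $n = 1$ the leakage equals the entropy of the codeword length, and it decreases as more symbols are appended to the bitstream, which is consistent with the requirement that secrecy holds only in the limit of a long transmission.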

III. Causal Rate Distortion with Security 
Constraints and SI 

In this section, we extend the work of [1], [2] to include
secrecy constraints. We consider the following source model:
Alice, Bob, and Eve observe sequences of random variables
$X^n$, $Y^n$, and $W^n$, respectively, which take values over
discrete alphabets $\mathcal{X}, \mathcal{Y}, \mathcal{W}$, respectively. $(X^n, Y^n, W^n)$ are
distributed according to a joint distribution $p(x^n, y^n, w^n) =
\prod_{t=1}^{n} p(x_t) p(y_t|x_t) p(w_t|y_t)$, i.e., the triplets $(X_t, Y_t, W_t)$
are created by a DMS with the structure $X \leftrightarrow Y \leftrightarrow W$.
$(Y^n, W^n)$ are the SI sequences seen by Bob and Eve, respectively.
Unlike [6], [9], we will treat in this paper only the case
of degraded SI. This model covers the scenarios where no SI
is available or is available only to Bob as special cases. Both
Alice and Bob have access to a shared secret key denoted by
$K$, $K \in \{0, 1, 2, \ldots, M_K\}$, which is independent of the source.

Let $\hat{\mathcal{X}}$ be Bob's reproduction alphabet and let $d : \mathcal{X} \times
\hat{\mathcal{X}} \to [0, \infty)$ be a distortion measure with $D_{\min} = \min_{x, \hat{x}} d(x, \hat{x})$. Finally, let $d(x^n, \hat{x}^n) =
\frac{1}{n} \sum_{t=1}^{n} d(x_t, \hat{x}_t)$. Alice encodes $X^n$, using the key, $K$, and
creates a bit sequence $Z = Z_1, Z_2, \ldots$ which is transmitted
through a clean channel to Bob. Bob uses $(K, Y^n, Z)$ to create
an estimate sequence, $\hat{X}^n$, such that $E d(X^n, \hat{X}^n) \le D$. We
allow the decoder to fail and declare an error with a vanishing
probability of error. Namely, for every $\delta > 0$ there exists $n$
large enough such that the probability of error is less than $\delta$.

We assume that Eve intercepts the transmitted bits, $Z$.
The security of the system is measured by the uncertainty
of Eve with regard to the source sequence, measured by
$\frac{1}{n} H(X^n | W^n, Z)$. As in [1], we call the cascade of encoder
and decoder a reproduction coder. We say that a reproduction
function is causal relative to the source if

$\hat{X}_t = f_t(X_{-\infty}^{\infty}, K) = f_t(\tilde{X}_{-\infty}^{\infty}, K)$ if $X_{-\infty}^{t} = \tilde{X}_{-\infty}^{t}$. (10)



Note that we did not restrict the use of the key to be causal in
any sense. Moreover, this definition does not rule out arbitrary
delays, and real-time operation is not considered here. We will only
treat the SI model covered in [2], where the SI is not used in
the reproduction of $\hat{X}^n$ but can be used for the compression
of $\hat{X}^n$. More complicated models will be treated in [11].
A causal reproduction coder is characterized by a family of
reproduction functions $\{f_k\}_{k=1}^{\infty}$ such that the reproduction $\hat{X}_k$
of the $k$th source output $X_k$ is given by $\hat{X}_k = f_k(K, X^k)$. If
the decoder declares an error, we will have $\hat{X}_k \ne f_k(K, X^k)$.
The probability of this event is the probability of decoder
error. The average distortion of an encoder-decoder pair with
an induced reproduction coder $\{f_k\}$ is defined by



$d(\{f_k\}) = \limsup_{n \to \infty} E\left[\frac{1}{n} \sum_{k=1}^{n} d(X_k, \hat{X}_k)\right]$. (11)



The encoder's rate is defined by $R = \limsup_{n \to \infty} \frac{1}{n} H(Z)$.

Let $\mathcal{R}$ denote the set of positive quadruples $(R, R_K, D, h)$
such that for every $\epsilon > 0$, $\delta > 0$ and sufficiently large $n$, there
exists an encoder and a decoder whose probability of error is
less than $\delta$, inducing a causal reproduction coder satisfying:

$\frac{1}{n} H(Z) \le R + \epsilon$, (12)
$\frac{1}{n} H(K) \le R_K + \epsilon$, (13)
$\frac{1}{n} \sum_{t=1}^{n} E\, d(X_t, \hat{X}_t) \le D + \epsilon$, (14)
$\frac{1}{n} H(X^n | W^n, Z) \ge h - \epsilon$. (15)

Let $r_{X|Y}(D)$ be the optimum performance theoretically attainable
(OPTA) function from [2] for the case where the SI is
available only at the decoder. Namely,

$r_{X|Y}(D) = \min_{f : E d(X, f(X)) \le D} H(f(X)|Y)$, (16)

and let $\underline{r}_{X|Y}(\cdot)$ denote the lower convex envelope of $r_{X|Y}(\cdot)$.



We have the following theorem. 

Theorem III. $(R, R_K, D, h) \in \mathcal{R}$ if and only if

$h \le H(X|W)$, $D \ge D_{\min}$, $R \ge \underline{r}_{X|Y}(D)$,
$R_K \ge h - H(X|W) + \underline{r}_{X|Y}(D)$. (17)

If $h - H(X|W) + \underline{r}_{X|Y}(D) \le 0$, no encryption is needed.

It is seen from the theorem that separation holds in this
case. The direct part of the proof is therefore straightforward:
First, quantize the source within distortion $D$ by the scheme
given in [2]. As was shown in [1], this step requires time-sharing
no more than two memoryless quantizers. Now use
Slepian-Wolf encoding to encode the resulting quantized
symbols given the SI at Bob. Finally, use a one-time pad of
$n(h - H(X|W) + \underline{r}_{X|Y}(D))$ bits on the block describing the
bin number.
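To get a feel for the key-rate expression in (17), the following sketch evaluates it for a toy binary example. All specifics here are our own assumptions for illustration: $X$ is uniform on $\{0,1\}$, $Y$ is $X$ passed through a binary symmetric channel with crossover 0.1, $W$ is $Y$ passed through a second such channel (so the SI is degraded), the distortion is Hamming with $D = 0$ (so the only admissible reproduction function is the identity and the envelope value is $H(X|Y)$), and the names `binary_entropy` and `min_key_rate` are hypothetical.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def min_key_rate(h, h_x_given_w, r_d):
    """Smallest key rate allowed by (17): max(0, h - H(X|W) + r(D))."""
    if h > h_x_given_w:
        raise ValueError("h cannot exceed H(X|W)")
    return max(0.0, h - h_x_given_w + r_d)

# Assumed toy source: X uniform, Y = X through BSC(0.1), W = Y through BSC(0.1).
p_xy, p_yw = 0.1, 0.1
p_xw = p_xy * (1 - p_yw) + (1 - p_xy) * p_yw  # crossover of the cascade
h_x_given_w = binary_entropy(p_xw)            # H(X|W)
r_0 = binary_entropy(p_xy)                    # H(X|Y): envelope value at D = 0

print(min_key_rate(h_x_given_w, h_x_given_w, r_0))  # maximal secrecy level
print(min_key_rate(0.1, h_x_given_w, r_0))          # modest secrecy level
```

In this assumed setting, demanding the maximal equivocation $h = H(X|W)$ requires a key rate equal to the full compression rate $H(X|Y)$, while a sufficiently small $h$ is already provided by Eve's weaker side information and needs no key at all.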

We now proceed to prove the converse part, starting with
lower bounding the encoding rate. For any $k$, let $\bar{X}_k =
f_k(K, X^k)$. $\bar{X}_k$ is equal to $\hat{X}_k$ when there is no decoding
error. Since the probability of decoder failure vanishes, we
have from Fano's inequality ([12]) that for every $\epsilon > 0$, there
exists $n$ large enough such that $H(\bar{X}^n | K, Y^n, Z) \le n\epsilon$.

For $n$ large enough and every encoder and decoder pair that
induces a causal reproduction coder and satisfies (12)-(15), the
following chain of inequalities holds:

$nR \ge H(Z)$
$\ge H(Z|K, Y^n) - H(Z|K, \bar{X}^n, Y^n)$
$= H(\bar{X}^n|K, Y^n) - H(\bar{X}^n|K, Y^n, Z)$
$\ge H(\bar{X}^n|K, Y^n) - n\epsilon$ (18)
$= \sum_{t=1}^{n} H(\bar{X}_t|K, \bar{X}^{t-1}, Y^n) - n\epsilon$
$\ge \sum_{t=1}^{n} H(\bar{X}_t|K, X^{t-1}, \bar{X}^{t-1}, Y^n) - n\epsilon$
$= \sum_{t=1}^{n} H(\bar{X}_t|K, X^{t-1}, Y^n) - n\epsilon$
$= \sum_{t=1}^{n} H(f_t(X^{t-1}, K, X_t)|K, X^{t-1}, Y^n) - n\epsilon$, (19)

where (18) follows from Fano's inequality. From here, using
the independence of the key and the source and following the
steps used in [2, Appendix, eq. (A.11)] we can show that $R \ge
\underline{r}_{X|Y}(D)$. The key rate can be lower bounded as follows:

$nR_K = H(K)$
$\ge H(K|Z, W^n)$
$= I(X^n; K|W^n, Z) + H(K|X^n, W^n, Z)$
$= H(X^n|W^n, Z) - H(X^n|K, W^n, Z) + H(K|X^n, W^n, Z)$
$\ge nh - H(X^n|K, W^n, Z) + H(K|X^n, W^n, Z)$
$\ge nh - H(X^n|K, W^n, Z)$. (20)

We continue by focusing on $H(X^n|K, W^n, Z)$:

$H(X^n|K, W^n, Z)$
$= I(X^n; Y^n|K, W^n, Z) + H(X^n|K, Y^n, W^n, Z)$ (21)
$\le H(Y^n|W^n) - H(Y^n|X^n, W^n) + H(X^n|K, Y^n, Z)$ (22)
$= I(X^n; Y^n|W^n) + H(X^n|K, Y^n, \hat{X}^n, Z)$
$\le nI(X; Y|W) + H(X^n|K, Y^n, \bar{X}^n, Z) + n\epsilon$ (23)
$\le n(H(X|W) - H(X|Y)) + H(X^n|K, Y^n, \bar{X}^n) + n\epsilon$, (24)

where in (21) we used the degraded structure of the source.
(22) is true since $Z$ is a function of $(K, X^n)$, $K$ is
independent of the source, and conditioning reduces entropy. (23) is true by Fano's inequality
and the fact that $\bar{X}^n$, the error-free reproduction with $\bar{X}_k = f_k(K, X^k)$, is a function of $(K, X^n)$. Focusing on
the last term of (24) we have

$H(X^n|K, Y^n, \bar{X}^n) = H(X^n|K, Y^n) - I(X^n; \bar{X}^n|K, Y^n)$
$= nH(X|Y) - I(X^n; \bar{X}^n|K, Y^n)$
$= nH(X|Y) - H(\bar{X}^n|K, Y^n) + H(\bar{X}^n|K, X^n, Y^n)$
$\le nH(X|Y) - H(\bar{X}^n|K, Y^n) + n\epsilon$ (25)
$\le nH(X|Y) - n\underline{r}_{X|Y}(D) + n\epsilon$, (26)

where (25) is true since $\bar{X}^n$ is a function of $(K, X^n)$ through
the reproduction coders and Fano's inequality. Finally, the last
line follows from the steps leading from (18) to the bound $R \ge \underline{r}_{X|Y}(D)$. Combining (26) with (24) into (20), and noting that $\epsilon$ is arbitrary, we
showed that $R_K \ge h - H(X|W) + \underline{r}_{X|Y}(D)$.